In Kubernetes we routinely rely on third-party plugins to integrate with AWS services, and EKS ships corresponding versions of them, such as the VPC CNI plugin, CoreDNS, kube-proxy, and the EBS CSI driver. The EKS add-on feature provides managed add-ons: security updates and vulnerability fixes are handled and rolled out by EKS itself.
This article explores why EKS can proactively update Kubernetes plugins that are already deployed, and yet the environment variables I configured myself are never modified.
The walkthrough below migrates an existing VPC CNI plugin to the EKS add-on as an example:
1. Check the IAM role ARN of the associated IRSA (IAM roles for service accounts) [2] used by the aws-node ServiceAccount.
$ kubectl -n kube-system get sa aws-node -o yaml
apiVersion: v1
kind: ServiceAccount
metadata:
annotations:
eks.amazonaws.com/role-arn: arn:aws:iam::111111111111:role/eksctl-ironman-2022-addon-iamserviceaccount-kube-Role1-EMRZZYBVKCRL
kubectl.kubernetes.io/last-applied-configuration: |
{"apiVersion":"v1","kind":"ServiceAccount","metadata":{"annotations":{},"name":"aws-node","namespace":"kube-system"}}
creationTimestamp: "2022-09-14T09:46:16Z"
labels:
app.kubernetes.io/managed-by: eksctl
name: aws-node
namespace: kube-system
resourceVersion: "1657"
uid: 10f55f1c-cff2-47cd-96e0-3ca6d37e69ce
secrets:
- name: aws-node-token-9z9wq
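If only the role ARN is needed, a jsonpath expression can extract the annotation directly. A convenience one-liner (the backslash-escaped dots in the annotation key are required by kubectl's jsonpath syntax):
$ kubectl -n kube-system get sa aws-node -o jsonpath='{.metadata.annotations.eks\.amazonaws\.com/role-arn}'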
2. Confirm that the currently deployed VPC CNI plugin version is v1.10.1.
$ kubectl -n kube-system get ds aws-node -o yaml | grep "image: "
image: 602401143452.dkr.ecr.eu-west-1.amazonaws.com/amazon-k8s-cni:v1.10.1-eksbuild.1
image: 602401143452.dkr.ecr.eu-west-1.amazonaws.com/amazon-k8s-cni-init:v1.10.1-eksbuild.1
3. Use the eksctl utils describe-addon-versions command to list the add-on versions supported by the current EKS cluster. The excerpt below shows that the latest vpc-cni version is v1.11.4-eksbuild.1.
$ eksctl utils describe-addon-versions --cluster=ironman-2022 | tail -n 1 | jq .
...
...
...
{
"AddonName": "vpc-cni",
"AddonVersions": [
{
"AddonVersion": "v1.11.4-eksbuild.1",
"Architecture": [
"amd64",
"arm64"
],
"Compatibilities": [
{
"ClusterVersion": "1.22",
"DefaultVersion": false,
"PlatformVersions": [
"*"
]
}
]
},
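The same catalogue is also available straight from the AWS CLI. A sketch assuming AWS CLI v2, filtered to the add-on and cluster version at hand:
$ aws eks describe-addon-versions --addon-name vpc-cni --kubernetes-version 1.22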
4. Use the eksctl create addon command to update the VPC CNI plugin to v1.11.4-eksbuild.1 and bind the corresponding Service Account IAM role ARN.
$ eksctl create addon --name vpc-cni --version v1.11.4-eksbuild.1 --service-account-role-arn="arn:aws:iam::111111111111:role/eksctl-ironman-2022-addon-iamserviceaccount-kube-Role1-EMRZZYBVKCRL" --cluster=ironman-2022
2022-10-10 11:22:30 [ℹ] Kubernetes version "1.22" in use by cluster "ironman-2022"
2022-10-10 11:22:30 [ℹ] when creating an addon to replace an existing application, e.g. CoreDNS, kube-proxy & VPC-CNI the --force flag will ensure the currently deployed configuration is replaced
2022-10-10 11:22:30 [ℹ] using provided ServiceAccountRoleARN "arn:aws:iam::111111111111:role/eksctl-ironman-2022-addon-iamserviceaccount-kube-Role1-EMRZZYBVKCRL"
2022-10-10 11:22:30 [ℹ] creating addon
Error: addon status transitioned to "CREATE_FAILED"
From the error above we can see that the add-on creation failed with status CREATE_FAILED.
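To dig into why it failed, the add-on's health issues can be queried. A minimal check, assuming the AWS CLI:
$ aws eks describe-addon --cluster-name ironman-2022 --addon-name vpc-cni --query 'addon.health.issues'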
5. Retry with the --force flag to force the update.
$ eksctl create addon --name vpc-cni --version v1.11.4-eksbuild.1 --service-account-role-arn="arn:aws:iam::111111111111:role/eksctl-ironman-2022-addon-iamserviceaccount-kube-Role1-EMRZZYBVKCRL" --cluster=ironman-2022 --force
2022-10-10 11:24:02 [ℹ] Kubernetes version "1.22" in use by cluster "ironman-2022"
2022-10-10 11:24:02 [ℹ] using provided ServiceAccountRoleARN "arn:aws:iam::111111111111:role/eksctl-ironman-2022-addon-iamserviceaccount-kube-Role1-EMRZZYBVKCRL"
2022-10-10 11:24:02 [ℹ] creating addon
2022-10-10 11:27:32 [ℹ] addon "vpc-cni" active
6. Confirm that the DaemonSet images have been updated to v1.11.4-eksbuild.1.
$ kubectl -n kube-system get ds aws-node -o yaml | grep "image: "
image: 602401143452.dkr.ecr.eu-west-1.amazonaws.com/amazon-k8s-cni:v1.11.4-eksbuild.1
image: 602401143452.dkr.ecr.eu-west-1.amazonaws.com/amazon-k8s-cni-init:v1.11.4-eksbuild.1
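As a cross-check, eksctl can report the add-on's status and version as well:
$ eksctl get addon --name vpc-cni --cluster ironman-2022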
Following the usual routine, the first stop after any change is the audit log. Use the following CloudWatch Logs Insights query to inspect updates to the DaemonSet aws-node:
filter @logStream like /^kube-apiserver-audit/
| fields @timestamp, @message
| sort @timestamp asc
| filter objectRef.name == 'aws-node' AND objectRef.resource == 'daemonsets' AND verb == 'patch'
| limit 10000
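The query can also be launched without the console. A sketch assuming the default control-plane log group naming /aws/eks/&lt;cluster&gt;/cluster and epoch-second timestamps (both are assumptions, adjust to your environment):
$ aws logs start-query \
    --log-group-name /aws/eks/ironman-2022/cluster \
    --start-time 1665360000 --end-time 1665446400 \
    --query-string "filter @logStream like /^kube-apiserver-audit/ | filter objectRef.name == 'aws-node' and objectRef.resource == 'daemonsets' and verb == 'patch' | fields @timestamp, @message | sort @timestamp asc | limit 10000"
$ aws logs get-query-results --query-id <query-id-from-the-previous-output>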
The first record is the initial, failed update issued through the eksctl command; it failed with FieldManagerConflict on the .spec.template.spec.containers and .spec.template.spec.initContainers fields:
"@message": {
"kind": "Event",
"apiVersion": "audit.k8s.io/v1",
"level": "RequestResponse",
"auditID": "b7e38710-2b5c-4317-a899-7a8b0f603b80",
"stage": "ResponseComplete",
"requestURI": "/apis/apps/v1/namespaces/kube-system/daemonsets/aws-node?dryRun=All&fieldManager=eks&force=false&timeout=10s",
"verb": "patch",
"user": {
"username": "eks:addon-manager",
"uid": "aws-iam-authenticator:348370419699:AROAVCHD6X7Z4UOOTWH7N",
"groups": [
"system:authenticated"
],
...
"userAgent": "Go-http-client/1.1",
"objectRef": {
"resource": "daemonsets",
"namespace": "kube-system",
"name": "aws-node",
"apiGroup": "apps",
"apiVersion": "v1"
},
"responseStatus": {
"metadata": {},
"status": "Failure",
"reason": "Conflict",
"code": 409
},
...
...
...
"responseObject": {
"kind": "Status",
"apiVersion": "v1",
"metadata": {},
"status": "Failure",
"message": "Apply failed with 2 conflicts: conflicts with \"kubectl-client-side-apply\" using apps/v1:\n- .spec.template.spec.containers[name=\"aws-node\"].image\n- .spec.template.spec.initContainers[name=\"aws-vpc-cni-init\"].image",
"reason": "Conflict",
"details": {
"causes": [
{
"reason": "FieldManagerConflict",
"message": "conflict with \"kubectl-client-side-apply\" using apps/v1",
"field": ".spec.template.spec.containers[name=\"aws-node\"].image"
},
{
"reason": "FieldManagerConflict",
"message": "conflict with \"kubectl-client-side-apply\" using apps/v1",
"field": ".spec.template.spec.initContainers[name=\"aws-vpc-cni-init\"].image"
}
]
},
"code": 409
},
"requestReceivedTimestamp": "2022-10-10T11:22:31.607660Z",
"stageTimestamp": "2022-10-10T11:22:31.613812Z",
The second record is the successful update via --force. Two things stand out:
- requestURI: now carries force=true
- responseObject: gains a managedFields field whose manager is eks
"@message": {
"kind": "Event",
"apiVersion": "audit.k8s.io/v1",
"level": "RequestResponse",
"auditID": "40e4c444-65f0-49a8-8efa-f019a0e38508",
"stage": "ResponseComplete",
"requestURI": "/apis/apps/v1/namespaces/kube-system/daemonsets/aws-node?fieldManager=eks&force=true&timeout=10s",
"verb": "patch",
"user": {
"username": "eks:addon-manager",
"uid": "aws-iam-authenticator:348370419699:AROAVCHD6X7Z4UOOTWH7N",
"groups": [
"system:authenticated"
],
...
...
...
"userAgent": "Go-http-client/1.1",
"objectRef": {
"resource": "daemonsets",
"namespace": "kube-system",
"name": "aws-node",
"apiGroup": "apps",
"apiVersion": "v1"
},
"responseStatus": {
"metadata": {},
"code": 200
},
...
...
"responseObject": {
"kind": "DaemonSet",
"apiVersion": "apps/v1",
"metadata": {
"name": "aws-node",
"namespace": "kube-system",
"uid": "19ae2e66-9bf4-4d21-9161-c0f564a62b34",
"resourceVersion": "7590998",
"generation": 8,
"creationTimestamp": "2022-09-14T09:46:16Z",
"labels": {
"k8s-app": "aws-node"
},
...
...
...
"managedFields": [
{
"manager": "eks",
"operation": "Apply",
"apiVersion": "apps/v1",
"time": "2022-10-10T11:24:03Z",
"fieldsType": "FieldsV1",
"fieldsV1": {
"f:metadata": {
"f:labels": {
"f:k8s-app": {}
}
},
"f:spec": {
"f:selector": {},
"f:template": {
"f:metadata": {
"f:labels": {
"f:app.kubernetes.io/name": {},
"f:k8s-app": {}
}
},
"f:spec": {
"f:affinity": {
"f:nodeAffinity": {
"f:requiredDuringSchedulingIgnoredDuringExecution": {}
}
},
"f:containers": {
"k:{\"name\":\"aws-node\"}": {
".": {},
"f:env": {
"k:{\"name\":\"AWS_VPC_K8S_CNI_CONFIGURE_RPFILTER\"}": {
".": {},
"f:name": {},
"f:value": {}
},
...
...
Accordingly, we can also observe Field Management with the kubectl command's --show-managed-fields flag.
$ kubectl -n kube-system get ds aws-node --show-managed-fields -o yaml | grep -A5 "managedFields"
managedFields:
- apiVersion: apps/v1
fieldsType: FieldsV1
fieldsV1:
f:metadata:
f:labels:
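To list just the managers that own parts of the object, jq can summarize the managedFields entries (assuming jq is installed):
$ kubectl -n kube-system get ds aws-node --show-managed-fields -o json \
    | jq -r '.metadata.managedFields[] | "\(.manager) (\(.operation))"'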
Both the EKS add-on documentation [1] and the blog post [3] mention that the EKS add-on feature uses the Kubernetes 1.18 feature Server-Side Apply [4]. Server-Side Apply moves the kubectl apply logic to the Kubernetes API server side and provides a merge algorithm to resolve conflicts when configuration fields collide. To track ownership, the field management information is stored in the managedFields field [5] of the object's metadata.
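The conflict behaviour is easy to reproduce outside EKS. A minimal sketch (the Deployment, file name, and field manager names are all illustrative, not from the EKS setup above): two appliers set a different value for the same field, and the API server rejects the second one with a 409 unless it forces ownership.
# demo.yaml — a throwaway Deployment used only for this experiment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ssa-demo
spec:
  replicas: 2
  selector:
    matchLabels: {app: ssa-demo}
  template:
    metadata:
      labels: {app: ssa-demo}
    spec:
      containers:
      - name: web
        image: nginx:1.23

$ kubectl apply --server-side --field-manager=team-a -f demo.yaml
# change replicas: 2 -> 3 in demo.yaml, then apply as a different manager
$ kubectl apply --server-side --field-manager=team-b -f demo.yaml
# expected: Apply failed with 1 conflict: conflict with "team-a": .spec.replicas
$ kubectl apply --server-side --field-manager=team-b --force-conflicts -f demo.yaml
Note that if the second applier submits an identical value, Server-Side Apply records shared ownership instead of raising a conflict; a 409 only occurs when the values differ.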
Returning to how EKS uses Field Management [6], the managed fields fall into two types:
1. A field listed under f: with no k: beneath it is fully managed: any modification causes a conflict. Taking aws-node as an example, the container named aws-node is managed by EKS, and its args, image, and imagePullPolicy sub-fields are all EKS-managed, so changing any value in these fields causes a conflict (the dry-run probe after the snippet below illustrates this). This is exactly why the first EKS add-on creation above failed, with conflicts on .spec.template.spec.containers and .spec.template.spec.initContainers.
f:containers:
k:{"name":"aws-node"}:
.: {}
f:args: {}
f:image: {}
f:imagePullPolicy: {}
...
manager: eks
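Under these rules, applying a manifest that changes the aws-node container image with any field manager other than eks should hit the same 409. A hypothetical probe using a server-side dry run, so nothing is actually changed (my-aws-node.yaml is an assumed local copy of the DaemonSet with a different image):
$ kubectl -n kube-system apply --server-side --dry-run=server --field-manager=my-team -f my-aws-node.yaml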
2. A field listed with k: keys is managed per key. Taking the aws-node environment variables as an example, eks manages the env entries keyed by ADDITIONAL_ENI_TAGS, AWS_VPC_CNI_NODE_PORT_SUPPORT, and so on. Modifying an entry under one of these managed keys causes a conflict, whereas environment variables we add under other keys are not asserted by the EKS add-on, which is why our own values are preserved.
f:containers:
k:{"name":"aws-node"}:
.: {}
f:env:
.: {}
k:{"name":"ADDITIONAL_ENI_TAGS"}:
.: {}
f:name: {}
f:value: {}
k:{"name":"AWS_VPC_CNI_NODE_PORT_SUPPORT"}:
.: {}
f:name: {}
f:value: {}
k:{"name":"AWS_VPC_ENI_MTU"}:
.: {}
f:name: {}
f:value: {}
k:{"name":"AWS_VPC_K8S_CNI_CONFIGURE_RPFILTER"}:
.: {}
f:name: {}
f:value: {}
k:{"name":"AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG"}:
.: {}
f:name: {}
f:value: {}
k:{"name":"AWS_VPC_K8S_CNI_EXTERNALSNAT"}:
.: {}
f:name: {}
f:value: {}
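Since eks only asserts the env keys recorded in its managedFields, a variable added under a key outside that set is left alone by the add-on's forced applies. A hedged example (WARM_IP_TARGET is a real VPC CNI setting, but whether it conflicts depends on whether it appears in the eks-managed key list on your cluster):
$ kubectl -n kube-system set env daemonset/aws-node WARM_IP_TARGET=2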
The EKS add-on feature is essentially an adoption of Kubernetes Server-Side Apply. Having AWS take care of security fixes and version updates is convenient, but whenever customization is needed, the changes still have to be checked against the Field Management entries. CoreDNS, for instance, is deployed as a DaemonSet in some scenarios, and that kind of customization may not be achievable through the add-on mechanism.
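For subsequent upgrades, the AWS CLI also exposes an explicit conflict-resolution switch. A closing sketch, with --resolve-conflicts OVERWRITE behaving like eksctl's --force above:
$ aws eks update-addon --cluster-name ironman-2022 --addon-name vpc-cni \
    --addon-version v1.11.4-eksbuild.1 --resolve-conflicts OVERWRITE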